15 research outputs found
Fast, Dense Feature SDM on an iPhone
In this paper, we present our method for enabling dense SDM to run at over 90
FPS on a mobile device. Our contributions are two-fold. First, drawing
inspiration from the FFT, we propose a Sparse Compositional Regression (SCR)
framework, which enables a significant speed-up over classical dense
regressors. Second, we propose Binary Approximated SIFT (BASIFT) features, a
computationally efficient binary approximation to SIFT, a feature commonly
used with SDM. We demonstrate the performance of our
algorithm on an iPhone 7, and show that we achieve similar accuracy to SDM
Learning Background-Aware Correlation Filters for Visual Tracking
Correlation Filters (CFs) have recently demonstrated excellent performance in
terms of rapidly tracking objects under challenging photometric and geometric
variations. The strength of the approach comes from its ability to efficiently
learn - "on the fly" - how the object is changing over time. A fundamental
drawback to CFs, however, is that the background of the object is not
modelled over time, which can lead to suboptimal results. In this paper we
propose a Background-Aware CF that can model how both the foreground and
background of the object vary over time. Our approach, like conventional CFs,
is extremely computationally efficient - and extensive experiments over
multiple tracking benchmarks demonstrate the superior accuracy and real-time
performance of our method compared to the state-of-the-art trackers including
those based on a deep learning paradigm.
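As a rough illustration of how a conventional correlation filter learns and localises a target, here is a minimal one-dimensional, MOSSE-style sketch. It is not the Background-Aware CF proposed in the abstract. The function names (`dft`, `train_filter`, `locate`), the Gaussian target response, and the parameters `sigma` and `lam` are assumptions made for this example only.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, kept simple for clarity."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def train_filter(signal, target_pos, sigma=1.0, lam=1e-2):
    """Closed-form regularised filter in the Fourier domain (MOSSE-style)."""
    n = len(signal)
    # Desired response: a circular Gaussian peak centred on the target.
    g = []
    for i in range(n):
        d = min(abs(i - target_pos), n - abs(i - target_pos))
        g.append(math.exp(-0.5 * (d / sigma) ** 2))
    F, G = dft(signal), dft(g)
    # Element-wise: H* = G . conj(F) / (F . conj(F) + lambda)
    return [G[k] * F[k].conjugate() / (F[k] * F[k].conjugate() + lam) for k in range(n)]


def locate(h_conj, signal):
    """Correlate the filter with a new signal and return the peak index."""
    F = dft(signal)
    response = dft([F[k] * h_conj[k] for k in range(len(F))], inverse=True)
    real = [r.real for r in response]
    return max(range(len(real)), key=real.__getitem__)


# Train on one "frame", then find the target after a circular shift of 3.
frame0 = [0, 0, 0, 0, 1, 4, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0]
frame1 = frame0[-3:] + frame0[:-3]
h = train_filter(frame0, target_pos=5)
new_pos = locate(h, frame1)
```

Training and detection are both element-wise operations in the Fourier domain, which is where the efficiency of this family of trackers comes from. A practical implementation would use a 2D FFT rather than the quadratic-time transform shown here.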
An intelligent vision system for wildlife monitoring
The understanding of animal behaviour in response to human development is vital for sustainable management of ecosystems. Existing methods of monitoring wildlife activity fall short in accuracy, accessibility, cost and practicality.
To address these shortcomings, recent technological advances have led to the development of electronic, autonomous wildlife monitoring solutions. Whilst these developments improve on existing methods in some areas, there is still much to be desired. This dissertation outlines the development of an accessible, affordable, intelligent vision-based technique which addresses limitations of existing monitoring methods.
A signal processing methodology was investigated, developed and implemented. Its development covered two distinct facets of computer vision: image segmentation and event classification.
The existing literature was reviewed, and several image segmentation techniques were explored. Upon further investigation, the Gaussian Mixture Model was selected in two forms: per-pixel modelling (Zivkovic 2004) and a compressive sensing based method (Shen, Hu, Yang, Wei & Chou 2012). Each method was evaluated in terms of real-time capability and accuracy to provide a basis for recommending the method presented in the prototype.
Upon evaluation, the proposed compressive sensing based method was found to be suitable for the prototype, and recommendations regarding the implementation and commissioning of the system were made. Furthermore, possible avenues for further research and development were explored.
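To give a feel for per-pixel background modelling, the sketch below keeps a single running Gaussian (mean and variance) for each pixel and flags pixels that deviate strongly from it. This is a deliberate simplification: it is neither the adaptive mixture of Zivkovic (2004) nor the compressive sensing based method evaluated in the dissertation. The function name `update_background` and the parameters `alpha`, `k` and `min_var` are assumptions made for illustration.

```python
def update_background(mean, var, frame, alpha=0.05, k=2.5, min_var=4.0):
    """Classify each pixel, then update the per-pixel Gaussian in place.

    `mean`, `var` and `frame` are lists of rows of numbers with equal shape.
    Returns a foreground mask (1 = foreground, 0 = background).
    """
    mask = []
    for r, row in enumerate(frame):
        mask_row = []
        for c, value in enumerate(row):
            diff = value - mean[r][c]
            # Foreground if the pixel lies more than k standard deviations away.
            is_fg = diff * diff > (k * k) * var[r][c]
            mask_row.append(1 if is_fg else 0)
            if not is_fg:
                # Only background pixels adapt the model, so a passing
                # animal is not absorbed into the background straight away.
                mean[r][c] += alpha * diff
                var[r][c] = max(min_var, (1 - alpha) * var[r][c] + alpha * diff * diff)
        mask.append(mask_row)
    return mask


# A 2x3 grey-level scene with one pixel that suddenly brightens.
mean = [[100.0] * 3 for _ in range(2)]
var = [[16.0] * 3 for _ in range(2)]
mask = update_background(mean, var, [[101, 99, 100], [100, 180, 100]])
```

A mixture model extends this idea by keeping several weighted Gaussians per pixel, which helps with multi-modal backgrounds such as swaying vegetation.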
Why capture frame rate matters for embedded vision
This thesis examines the practical challenges of reliable object and facial tracking on mobile devices. We investigate the capabilities of such devices and propose a number of strategies to leverage the hardware and architectural strengths offered by smartphones and other embedded systems. We show how high frame rate cameras can be used as a resource to trade off algorithmic complexity while still achieving reliable, real-time tracking performance. We also propose a number of strategies for formulating tracking algorithms which make better use of the architectural redundancies inherent to modern systems-on-chip.
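The trade-off between frame rate and algorithmic complexity can be made concrete with some simple arithmetic: a higher capture rate shrinks both the time available per frame and the distance an object moves between frames. The sketch below assumes a hypothetical object speed of 600 pixels per second; the function name `per_frame_budget` is also an assumption for this example.

```python
def per_frame_budget(fps, speed_px_per_s):
    """Return the per-frame time budget and inter-frame displacement."""
    return {
        "budget_ms": 1000.0 / fps,                # time available to process one frame
        "displacement_px": speed_px_per_s / fps,  # how far the object moves per frame
    }


slow = per_frame_budget(30, 600)
fast = per_frame_budget(240, 600)
```

Under these assumed numbers the object moves 20 pixels between frames at 30 FPS but only 2.5 pixels at 240 FPS, so a tracker can search a much smaller region, at the cost of having roughly 4.2 ms rather than 33.3 ms to process each frame.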
Need for speed: A benchmark for higher frame rate object tracking
In this paper, we propose the first higher frame rate video dataset (called
Need for Speed - NfS) and benchmark for visual object tracking. The dataset
consists of 100 videos (380K frames) captured with now commonly available
higher frame rate (240 FPS) cameras from real world scenarios. All frames are
annotated with axis aligned bounding boxes and all sequences are manually
labelled with nine visual attributes - such as occlusion, fast motion,
background clutter, etc. Our benchmark provides an extensive evaluation of many
recent and state-of-the-art trackers on higher frame rate sequences. We ranked
each of these trackers according to their tracking accuracy and real-time
performance. One of our surprising conclusions is that at higher frame rates,
simple trackers such as correlation filters outperform complex methods based on
deep networks. This suggests that for practical applications (such as in
robotics or embedded vision), one needs to carefully trade off bandwidth
constraints associated with higher frame rate acquisition, computational costs
of real-time analysis, and the required application accuracy. Our dataset and
benchmark allow for the first time (to our knowledge) systematic exploration
of such issues, and will be made available to allow for further research in
this space.
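The abstract does not state which accuracy measure is used to rank trackers. A common choice for axis-aligned bounding boxes is intersection over union (IoU), summarised as the fraction of frames whose overlap exceeds a threshold. The sketch below assumes that measure; the `(x, y, w, h)` box layout, the function names and the 0.5 threshold are all assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(predictions, ground_truth, threshold=0.5):
    """Fraction of frames whose IoU with the annotation exceeds the threshold."""
    overlaps = [iou(p, g) for p, g in zip(predictions, ground_truth)]
    return sum(o > threshold for o in overlaps) / len(overlaps)


truth = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (5, 0, 10, 10), (20, 20, 5, 5)]
rate = success_rate(preds, truth)
```

For scale, the figures quoted in the abstract (380K frames over 100 videos at 240 FPS) work out to about 3,800 frames, or roughly 16 seconds, per video on average.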